An integrated European Research Area (ERA) is a critical component for a
more competitive and open European R&D system. However, the impact of
EU-specific integration policies aimed at overcoming innovation barriers
associated with national borders is not well understood. Here we analyze
2.4 x 10^6 patent applications filed with the European Patent Office (EPO)
over the 25-year period 1986-2010 along with a sample of 2.6 x 10^5 records
from the ISI Web of Science to quantitatively measure the role of borders in
international R&D collaboration and mobility. From these data we construct
five different networks for each year analyzed: (i) the patent co-inventor
network, (ii) the publication co-author network, (iii) the co-applicant patent
network, (iv) the patent citation network, and (v) the patent mobility network.
We use methods from network science and econometrics to perform a comparative
analysis across time and between EU and non-EU countries to determine the
"treatment effect" resulting from EU integration policies. Using non-EU
countries as a control set, we provide quantitative evidence that, despite
decades of efforts to build a European Research Area, there has been little
integration above global trends in patenting and publication. This analysis
provides concrete evidence that Europe remains a collection of national
innovation systems.

Efforts towards European research and development (R&D) integration have a long history,
intensifying with the Fifth Framework Program (FP) in 1998 and the launch of the
European Research Area (ERA) initiative at the Lisbon European Council in 2000. A key
component of the European Union (EU) strategy for innovation and growth (1-5), the ERA aims at
an integrated innovation system through directed funding, increased mobility, and streamlined
innovation policies that can overcome national borders.

To assess the rate of progress towards this ERA vision, we analyze the evolution of
geographical collaboration networks constructed from patent and scientific publication data.
While these data may not capture every facet of ERA, they are widely accepted measures of R&D
output and the European Commission considers them crucial for the evaluation of the Horizon
2020 FP (6). All in all, we find no evidence since 2003 that EU innovation policies aimed
at promoting an integrated research and innovation system have corresponded to intensified
cross-border R&D activity in Europe vis-a-vis other OECD countries.

We exploit the June 2012 release of the OECD REGPAT database (7) and analyze all
~2.4 x 10^6 patent applications filed with the European Patent Office (EPO) over the period
1986-2010. For comparison with scientific publications we take a random sample of ~2.6 x 10^5
records from the ISI Web of Science over the period 1991-2009. We geo-coded each data set at
the NUTS3 region level (see Supplementary Materials (SM)). Using the data we construct five
networks, which provide different perspectives on EU R&D integration. In our networks, nodes
correspond to NUTS3 regions and links represent collaboration/mobility measures. Specifically,
(i) the patent co-inventor network and (ii) the publication co-author network measure
the intensity of inter-regional collaboration at the individual level; (iii) the co-applicant
patent network measures the collaboration between institutions ("applicants") located in
different regions; (iv) the patent citation network indirectly measures scientific integration
by following the flow of citations from patents in one region to patents in another; (v) the
patent mobility network measures the mobility of inventors from one region to another by
tracking their location in subsequent patents.

We use a standard network-clustering algorithm to identify communities, i.e., subsets of
nodes more strongly linked to one another than to nodes outside, to compare geopolitical
borders and R&D networks. Regional integration is shown in Fig. 1 in the purple community,
centered on Eindhoven, which is composed of strongly collaborating regions in the Benelux,
and in the international Nordic community with its center in Copenhagen. Despite these
exceptions, patterns of co-inventorship in Europe continue to be largely shaped by national
borders. This observation stands in contrast to the community structure of the highly
dispersed "coast-to-coast" US co-inventor network (see SM for comparison) (8). Figure 1 shows
Europe as a collection of regional and national innovation communities. However, that does
not necessarily mean that integration efforts have been unsuccessful. The more relevant
question, then, is at what rate Europe is evolving toward an integrated research system
relative to the rate of cross-border R&D collaboration observed in non-EU OECD countries.

Consistent with recent studies (5, 9, 10, 11), we observe a significant increase in the total
number of cross-border research collaborations, both within and outside Europe (see SM). To
assess the role of EU-specific factors, we compare the relative change in cross-border
collaboration between European countries (e.g., distinguishing German-French from
German-German and French-French collaborations) to the relative change in cross-border
collaboration between non-European OECD countries (e.g., distinguishing USA-Japan from
USA-USA and Japan-Japan collaborations). Collaborations between EU and non-EU regions are
not included in our analysis.

For each network, our econometric model simultaneously performs three quantitative
differences and controls for the size of regions, geographic distance and time effects (see
SM). First, the difference between the cross-border and intra-border average number of links
is computed, both for EU and for non-EU OECD nations. Second, the difference between these
two estimates isolates the impact of EU-specific factors on R&D integration. Third, the
difference relative to a baseline year yields the quantitative output of the model, i.e., the
expected number of additional links between regions resulting from EU-specific factors. This
quantity is shown in Figure 2. Comparing data points from two different years, a higher
y-axis value indicates a greater impact of EU-specific factors upon integration among EU
nations. Thus, by construction, the choice of the baseline year does not alter our results.
It also follows that a positive (negative) slope indicates Europe is integrating faster
(slower) than non-EU countries.
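
The three nested differences described above can be sketched with a small numerical example. This is only an illustration under stated assumptions: the link averages are invented numbers, not results from our data, the function names are hypothetical, and the sketch is a plain linear comparison of group averages, whereas the actual model is nonlinear and controls for region size, distance and time effects (see SM).

```python
# Toy illustration of the three nested differences (all numbers are invented).
# avg_links[(group, border_type, year)] = average number of links per region pair.
avg_links = {
    ("EU", "cross", 2003): 0.10, ("EU", "intra", 2003): 1.00,
    ("EU", "cross", 2009): 0.16, ("EU", "intra", 2009): 1.20,
    ("nonEU", "cross", 2003): 0.08, ("nonEU", "intra", 2003): 1.10,
    ("nonEU", "cross", 2009): 0.13, ("nonEU", "intra", 2009): 1.30,
}


def border_gap(group, year):
    """First difference: cross-border minus intra-border average links."""
    return avg_links[(group, "cross", year)] - avg_links[(group, "intra", year)]


def eu_specific_gap(year):
    """Second difference: EU border gap minus non-EU border gap."""
    return border_gap("EU", year) - border_gap("nonEU", year)


def triple_difference(year, baseline=2003):
    """Third difference: change of the EU-specific gap relative to the baseline year."""
    return eu_specific_gap(year) - eu_specific_gap(baseline)


print(round(triple_difference(2009), 4))
```

Changing the `baseline` argument shifts every yearly value by the same constant, which is why the slope between two years, rather than the level, carries the information.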

Since the late 1990s, we observe some signs of integration in European patent statistics. In
the case of the patent co-inventor network, there has been an increase in cross-border
collaboration in Europe vis-a-vis other OECD countries. This effect was relatively pronounced
from 1998 to 2002, but stalled in 2003. Since then, the additional number of links for an
average pair of regions due to Europe-specific factors has never been significantly larger
than zero. The patent co-applicant network exhibits no significant increase since 1996. The
citation network shows a temporary bump in integration in the late 1990s, then fluctuates
around that level. Finally, the inventors' mobility network shows almost no progress in the
last decade, confirming a slow pace of integration for the European high-skill labor market.

The scientific publications co-authorship network shows a negative trend since 1999,
indicating that cross-border links among non-EU OECD countries grew faster than European
cross-border links. These results are striking and deserve further investigation given the
amount of resources the EU has committed to promote cross-border scientific collaboration
through programs like FPs, European Cooperation in Science and Technology, Networks of
Excellence, Marie Curie Actions, etc.

In sum, our analysis of R&D patent and publication networks shows Europe remains a
collection of loosely coupled national innovation systems (12). Furthermore, since 2003,
cross-border collaborations in Europe have developed no faster than in the rest of the OECD
countries. Several ongoing initiatives seek to address a number of general shortcomings that
have affected previous integration efforts (5). The European Institute of Innovation and
Technology's (EIT) Knowledge and Innovation Communities are long-term (7-15 years)
collaboration networks spanning all aspects of the R&D ecosystem (13). To foster synergetic
interaction between national funding bodies, Science Europe, an association of national
research organizations, was founded in 2011 (14).

The European Research Council (15) has taken major steps to promote cross-border mobility
by making grants competitive and portable. Likewise, a memorandum of understanding
signed by the European Commission and the League of European Research Universities (16)
pushes for pension unification and transparency in hiring and tenure decisions.

Despite these initiatives to increase competition within the system, monitoring and
evaluation must drastically change if Europe is to accomplish its ambitious goals in Science
and Technology. Evidence-based evaluation focused on output and impact is crucial, as
recognized in the plans for the Horizon 2020 FP (6). Our methodology promotes this vision by
combining interdisciplinary expertise with data relevant to evaluation.

Figure 1: The community structure of the 2009 EU-15 co-inventor network. Communities are
shown with different colors and are labeled by their most central region. Communities have
been generated by iteratively aggregating nodes (NUTS3 regions) into clusters of increasing
size (see SM). Blank regions have no ties in 2009.

Figure 2: The evolution of European integration in five R&D networks. We use econometric
methods to measure the effect of EU-specific factors on the number of cross-border links
relative to within-border links and to the rest of non-EU OECD countries. Results are shown
for four different patent networks (black circles) and a scientific publication network
(green circles). Open circles indicate statistically significant (0.05 level) positive
deviations from the baseline year (2003). The y-axis reports the additional number of links
for an average pair of regions relative to 2003 due to R&D integration in Europe.

1 Materials and Methods

We perform a geo-spatial network analysis of scientific collaboration. In our framework, the
nodes are NUTS3 regions.^1 In our analysis of R&D integration, we distinguish between two
types of collaboration links: (a) links between NUTS3 regions within the same country, and
(b) cross-border links between NUTS3 regions in country m and NUTS3 regions in country n, as
illustrated by the green links in Fig. S1(A). Fig. S1(B) outlines our methodological approach,
where we analyze and compare the time evolution of collaboration networks in EU countries
vis-a-vis non-EU countries. We use an econometric model to measure the difference between the
network structure in year t* + Δt and the "baseline year", which we choose to be t* = 2003.

Supplementary materials are organized as follows: the first section describes our data
sources and database construction; the second section illustrates the network clustering
methods we employed; and the third section contains a detailed description of our statistical
methodology and results. All relevant data and code can be downloaded at:

http://cse.lab.imtlucca.it/SOM/SOM.zip

^1 The Nomenclature of Units for Territorial Statistics (NUTS) is a geo-code standard for
referencing the subdivisions of countries for statistical purposes. The nomenclature was
introduced by the European Union for its member states. The OECD provides an extended
version of NUTS3 for its non-EU member and partner states.

1.1 Data

Patent collaboration data are drawn from the OECD REGPAT database (7, 17), which compiles
all patent applications filed with the European Patent Office (EPO) since the 1960s. Within
this database the geographical locations of inventors and applicants are designated by one of
the 5,552 NUTS3 regions in 50 countries.^2 We use all patent applications across all classes
in the REGPAT database over the period 1986-2010, 2.4 x 10^6 applications overall.^3 We
construct four geographical networks: (i) co-inventors, (ii) co-applicants, (iii) citations
and (iv) inventor mobility. In (i) and (ii) the strength of a link between two regions is
equal to the number of patents jointly invented by or jointly assigned to the two regions. In
(iii) it is the number of patent citations between inventors' regions. Specifically, for each
pair of NUTS3 regions (i, j) we count the number of times that (a patent invented by an
inventor residing in) region i cites (a patent invented by an inventor residing in) region j;
this count is the strength of the link (i, j). Conversely, the number of citations that i
receives from j is the strength of the link (j, i). In (iv) the link weight is equal to the
number of inventors moving from region i to region j. As for citations, the mobility network
is directed, i.e., we distinguish between mobility from i to j and mobility in the opposite
direction, j to i. Links are created by tracking regional migration for inventors with at
least two patents. We compare the affiliations of an inventor's consecutive patents and
assign a new link whenever a new patent is filed in a region different from the one reported
in the inventor's previous patent.^4

^2 In our analysis we considered 40 countries. The European countries consist of the EU-15:
Austria, Belgium, Germany, Denmark, Spain, Finland, France, United Kingdom, Greece, Ireland,
Italy, Luxembourg, Netherlands, Portugal, Sweden. The control set is composed of 25 other
nations outside of the EU-15: Australia, Bulgaria, Brazil, Canada, Switzerland, Chile, China,
Hong Kong, Croatia, Israel, India, Iceland, Japan, South Korea, Liechtenstein, Macedonia,
Mexico, Norway, New Zealand, Romania, Russian Federation, Turkey, Taiwan, United States,
South Africa. EU-15 Gross Domestic Expenditure on R&D (GERD) is 2.6% of their combined GDP.
In the case of the non-EU control set, that number is 2.1%. Likewise, the distribution of
GERD within each set is similar, with a mix of high and low spending countries. On average
for EU-15 countries, 15% of the R&D budget comes from the EU and the remaining 85% from
national budgets. Conversely, while statistical figures are not available, the shared R&D
budget in non-EU OECD countries is considerably smaller.

^3 Data for 2010 might be incomplete as some EPO filings are published with lags and may not
appear in the data yet.
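
The construction of the directed mobility network can be sketched as follows. This is a minimal sketch, not the code used for our analysis: the record layout `(inventor_id, filing_date, region)`, the function name `mobility_links` and the sample records are assumptions made for the example, and the region codes are used only as illustrative labels.

```python
from collections import Counter


def mobility_links(patents):
    """Count directed region-to-region moves of inventors.

    `patents` is an iterable of (inventor_id, filing_date, region) tuples;
    this record layout is an assumption made for the example.
    """
    by_inventor = {}
    for inventor, date, region in patents:
        by_inventor.setdefault(inventor, []).append((date, region))

    links = Counter()
    for history in by_inventor.values():
        history.sort()  # ISO dates sort chronologically as strings
        # Compare each patent with the same inventor's previous patent.
        for (_, prev_region), (_, region) in zip(history, history[1:]):
            if region != prev_region:
                links[(prev_region, region)] += 1  # directed link i -> j
    return links


records = [
    ("inv1", "2001-03-01", "DE111"),
    ("inv1", "2003-06-15", "FR101"),
    ("inv1", "2005-01-10", "FR101"),
    ("inv2", "2002-02-02", "FR101"),
    ("inv2", "2004-09-30", "DE111"),
    ("inv3", "2006-07-07", "ITC11"),  # single patent: no mobility link
]
links = mobility_links(records)
```

Inventors with a single patent contribute no link, which matches the restriction to inventors with at least two patents.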

Fig. S1(C) illustrates the global trend of increased cross-border collaboration in the
co-inventor network, and increased cross-border flow in the mobility network. We count the
total number of intra-border collaboration links $N_I = \sum_n N_n(t)$ and the total number
of cross-border collaboration links $N_X = \sum_{m \neq n} N_{m,n}(t)$ and define the
collaboration share to be the ratio $S = N_X/(N_I + N_X)$ for a given time period (values
are shown in Fig. S1(C); values for the other networks are listed in Table S1). The overall
increasing trend reflects both the increasing pace of patenting and the decreasing role of
distance in worldwide research efforts. We note that for the case of Europe the 15% final
share matches the ratio of the EU research budget to the combined national research budgets
of EU nations.
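
As a worked illustration of the share defined above, the following sketch computes S from a table of link counts keyed by country pair. The function name `cross_border_share`, the input layout and the counts are assumptions made for the example and are not taken from our data.

```python
def cross_border_share(link_counts):
    """Return S = N_X / (N_I + N_X) from {(country_m, country_n): number of links}."""
    n_intra = sum(w for (m, n), w in link_counts.items() if m == n)  # N_I
    n_cross = sum(w for (m, n), w in link_counts.items() if m != n)  # N_X
    total = n_intra + n_cross
    return n_cross / total if total else 0.0


# Invented counts: 85 intra-border links and 15 cross-border links.
counts = {("DE", "DE"): 60, ("FR", "FR"): 25, ("DE", "FR"): 10, ("FR", "IT"): 5}
share = cross_border_share(counts)
```

In this sketch each unordered country pair is listed once; if both (m, n) and (n, m) were stored, cross-border links would be double counted relative to intra-border ones.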

Scientific publications data are drawn from the ISI Web of Science. The Web of Science
database is a bibliographical collection maintained by Thomson Reuters, considered to be one
of the most comprehensive and reliable sources of information on research activity across all
fields of science. We analyze a random sample of 256,015 research articles in the period
1991-2009 by authors affiliated with institutions located in the OECD countries. We build the
regional co-authorship network by geo-coding each address attached to the paper. Since
addresses refer to institutional affiliations and it is not possible to link individuals to
organizations (18), we define co-authorship as the co-occurrence of two or more addresses on
a publication. Therefore, if an author lists multiple affiliations in different regions we
consider co-authorship links between those regions in our analysis.
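
The address co-occurrence definition can be sketched as below. This is a hedged illustration: the function name `coauthorship_links` and the input layout (one list of geo-coded regions per paper) are assumptions, and counting each distinct pair of regions once per paper is one possible weighting convention rather than a statement of the exact rule used in our analysis.

```python
from collections import Counter
from itertools import combinations


def coauthorship_links(papers):
    """Count undirected region-region links from the regions listed on each paper."""
    links = Counter()
    for regions in papers:
        # Every distinct pair of regions co-occurring on a paper gets one link.
        for a, b in combinations(sorted(set(regions)), 2):
            links[(a, b)] += 1
    return links


papers = [
    ["DE111", "FR101"],
    ["DE111", "FR101", "ITC11"],
    ["FR101", "FR101"],  # two addresses in the same region: no inter-regional link
]
coauthor_links = coauthorship_links(papers)
```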

1.2 Community detection 



There are now many community detection methods for clustering networks (19), one of the most
popular being modularity optimization, introduced by Newman and Girvan (20). Some limitations
have been noted for this method, the most important being the existence of a resolution
limit (21) that prevents it from detecting small modules. Nevertheless it is reliable for
standard cluster analysis provided a suitable optimization procedure is employed. In the
present analysis we adopt a weighted version of the modularity function and optimize it using
the Louvain algorithm (22). This algorithm arrives at the final community structure by
starting from isolated nodes (NUTS3 regions in our case) and iteratively aggregating them
into communities of increasing size. This particular optimization procedure can mitigate the
effect of the resolution limit.

^4 The OECD REGPAT database provides a unique identifier for inventors' names. For more
detailed information on patenting activity the reader can refer to a survey of inventors for
around 9,000 European patented inventions (35).
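
For readers unfamiliar with the objective being optimized, the sketch below evaluates the standard weighted modularity Q of a given partition. It shows only the objective function, not the Louvain optimization itself; the function name `weighted_modularity`, the edge-dictionary layout and the toy graph are assumptions made for the example.

```python
def weighted_modularity(edges, community):
    """Weighted modularity Q of a partition of an undirected graph.

    edges: {(u, v): weight}, each undirected edge listed once with u != v.
    community: {node: community label}.
    Uses Q = sum_c [ W_c / m - (S_c / 2m)^2 ], where m is the total edge weight,
    W_c the weight of edges inside community c and S_c the summed node strength of c.
    """
    m = sum(edges.values())
    inside, strength = {}, {}
    for (u, v), w in edges.items():
        for node in (u, v):
            c = community[node]
            strength[c] = strength.get(c, 0.0) + w
        if community[u] == community[v]:
            c = community[u]
            inside[c] = inside.get(c, 0.0) + w
    return sum(inside.get(c, 0.0) / m - (s / (2 * m)) ** 2 for c, s in strength.items())


# Two triangles joined by a single edge, all weights equal to 1.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1, ("c", "d"): 1}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
q_two = weighted_modularity(edges, two_groups)
q_one = weighted_modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives a higher Q than keeping all nodes in one community, which is the kind of gain an aggregation procedure such as Louvain searches for.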

After determining the community structure we calculated the centrality of each node within
a community using a novel perturbative approach. Since we obtain the modularity score of a
network (Q) by an optimization procedure, every perturbation of the partition structure leads
to a negative variation in the modularity (dQ). For every node we calculate a dQ by moving
the node into every other community in the network. Within a specific community, the node
with the most negative dQ is defined as the most central node (core region). The legends of
Fig. 1 in the manuscript and Fig. S2 below identify the most central nodes (using the city
name associated with the NUTS3 region) for the top 13 communities in the 2009 co-inventor
networks, for Europe and the USA respectively.
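
The perturbative centrality can be sketched as follows. This is an illustration under assumptions, not our implementation: the function names are hypothetical, the toy graph is invented, and taking the most negative dQ over all alternative communities is one way to summarize the several moves available to each node.

```python
def modularity(edges, community):
    """Weighted modularity; edges is {(u, v): weight} with each edge listed once."""
    m = sum(edges.values())
    inside, total = {}, {}
    for (u, v), w in edges.items():
        for node in (u, v):
            total[community[node]] = total.get(community[node], 0.0) + w
        if community[u] == community[v]:
            inside[community[u]] = inside.get(community[u], 0.0) + w
    return sum(inside.get(c, 0.0) / m - (t / (2 * m)) ** 2 for c, t in total.items())


def core_nodes(edges, community):
    """For each community, find the node whose relocation lowers Q the most."""
    q0 = modularity(edges, community)
    labels = set(community.values())
    worst = {}  # node -> most negative dQ over all alternative communities (assumption)
    for node, own in community.items():
        dqs = []
        for other in labels - {own}:
            trial = dict(community)
            trial[node] = other          # perturb the partition by moving one node
            dqs.append(modularity(edges, trial) - q0)
        worst[node] = min(dqs) if dqs else 0.0
    cores = {}
    for node, own in community.items():
        if own not in cores or worst[node] < worst[cores[own]]:
            cores[own] = node
    return cores, worst


# Two hub-and-spoke groups joined by one weak bridge (weights are invented).
edges = {("h", "a1"): 3, ("h", "a2"): 3, ("a1", "a2"): 1,
         ("g", "b1"): 3, ("g", "b2"): 3, ("b1", "b2"): 1, ("a1", "b1"): 1}
partition = {"h": "A", "a1": "A", "a2": "A", "g": "B", "b1": "B", "b2": "B"}
cores, worst = core_nodes(edges, partition)
```

In this toy graph the hubs `h` and `g` come out as the core nodes, since removing them from their community costs the most modularity.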

The community structure of the top 13 communities in the USA co-inventor network in
2009 is shown in Fig. S2. Green arcs have been added to highlight (some of) the long-range
connections of the community for which San Francisco is the most central region. Communities
in the USA have a higher fraction of cross-border links than in the EU, indicating
drastically different levels of integration in their respective R&D collaboration networks.
This result is confirmed numerically by comparing, for European communities, the share of
links with at least one region outside the nation in which the core region is located with,
for US communities, the share of links with at least one member outside the state in which
the core region is located. This share is always significantly higher in the US co-inventor
network than in the EU one. Table S2 shows that the fraction of cross-border ties is on
average larger for the US (0.706) than for the EU (0.138), with t-statistic 7.16 (p < 0.01).
The only European cluster with cross-border connectivity comparable to the US ones is the
Nordic cluster centered on Copenhagen, which has many members outside Denmark, in Sweden and
Finland.

1.3 Statistical analysis 

The rate at which EU (NUTS3) regions are linking to regions in other EU countries is
increasing due to two types of factors: those that are global and those that are EU-specific.
Thus, to capture the effect of EU-specific institutional factors we must account for the net
effect of the global factors. In technical terms, we use the non-EU OECD members as a control
group and its behavior serves as the counterfactual behavior of EU regions.^5

^5 Given a general model with two state indicators A and B and two periods such as

$$y = \beta_0 + \beta_1 dA + \beta_2 dB + \beta_3\, dA * dB + \delta_0 d2 + \delta_1\, d2 * dA + \delta_2\, d2 * dB + \delta_3\, d2 * dA * dB + u,$$

it can be easily shown that in a linear setting with no further explanatory variables the
OLS estimate of the coefficient of the triple interaction term is just

$$\hat{\delta}_3 = (\bar{y}_{A,B,2} - \bar{y}_{A,B,1}) - (\bar{y}_{NA,B,2} - \bar{y}_{NA,B,1}) - (\bar{y}_{A,NB,2} - \bar{y}_{A,NB,1}),$$

where NA and NB indicate respectively the states not in A or not in B (26). Underlying this
analysis is the way we model the process that generates the link counts ($y_i$).

In our statistical analysis the number of links ($y_i = N_{k,l}$) between NUTS3 regions k
and l is regressed on a set of independent variables. We model this dependent variable with a
count density. A number of models can be found in the literature to handle count densities,
including the Poisson model, Negative Binomial model variants, and Zero-inflated models
(18, 23, 24, 25, 26, 27, 28). Since ~90% of our link counts are zero, we opted for a
zero-inflated negative binomial (ZINB) model, consistent with (26, 28).^6 Zero-inflated
models allow zeros to be generated by two distinct processes and are generally used when data
exhibit "excess zeros" (27). The ZINB model supplements a count density, $P$, with a binary
zero generating process $\psi$. This allows a zero count to be produced in two ways, either
as an outcome of the zero generating process with probability $\psi_i$, or as an outcome of
the count process $P$ provided the zero generating process did not produce a zero (which
happens with probability $1 - \psi_i$).

^6 In the case of the inventor mobility network (and only that case) the number of non-zero
link counts was too low to be modeled using ZINB. Rather than tinkering with the threshold,
we modeled only the pairs of regions with $y_i > 0$. A Zero Truncated Poisson model was
employed in this special case.

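The density distribution for the count pair $y_i$ is then given by

$$\Pr(y_i) = (1 - \psi_i)\, P(y_i), \qquad \text{(S1)}$$

for positive counts (a zero can additionally be produced by the zero generating process),
where the zero generating process $\psi_i$ is parameterized as a logistic function of the
regressors in $Z_i$, with parameter vector $\beta^0$:

$$\psi_i = \frac{\exp(Z_i \beta^0)}{1 + \exp(Z_i \beta^0)}. \qquad \text{(S2)}$$

The count process $P(y_i)$ is modeled as Negative Binomial of the second kind (NB2):

$$P(y_i) = \frac{\Gamma(y_i + \alpha^{-1})}{\Gamma(y_i + 1)\,\Gamma(\alpha^{-1})} \left( \frac{\alpha^{-1}}{\alpha^{-1} + \mu_i} \right)^{\alpha^{-1}} \left( \frac{\mu_i}{\alpha^{-1} + \mu_i} \right)^{y_i}, \qquad \text{(S3)}$$

where the conditional mean $\mu_i$ is parameterized as an exponential function of the linear
index $X_i \beta^1$, and $\alpha$ (> 0) is the overdispersion parameter. Thus, drawing
together equations S1, S2 and S3, our model for the expected count is

$$E[y_i \mid X_i, Z_i] = \left( 1 - \frac{\exp(Z_i \beta^0)}{1 + \exp(Z_i \beta^0)} \right) \exp(X_i \beta^1). \qquad \text{(S4)}$$

In our estimation procedure we assume $X_i = Z_i$ because there is no reason to expect some
variables would be relevant only in one of the two processes. However, individual regressors
can impact the $y_i$ estimator differently through the two distinct processes and their
separate parameter vectors, $\beta^0$ and $\beta^1$.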
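
The zero-inflated count model can be checked numerically with a short sketch. The code below is an illustration under assumptions, not the estimation code: the function names `zinb_pmf` and `zinb_mean` and all parameter values are invented, and the probability mass function uses the standard zero-inflated form in which a zero has probability psi + (1 - psi) P(0).

```python
import math


def zinb_pmf(y, mu, alpha, psi):
    """Zero-inflated NB2 probability of count y.

    mu: conditional mean of the count process, alpha: overdispersion (> 0),
    psi: probability that the zero generating process produces a zero.
    """
    r = 1.0 / alpha
    log_nb = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
              + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    nb = math.exp(log_nb)
    return psi + (1 - psi) * nb if y == 0 else (1 - psi) * nb


def zinb_mean(x, beta0, beta1):
    """Expected count as in Eq. S4 with X_i = Z_i = x (x includes the constant)."""
    z0 = sum(b * v for b, v in zip(beta0, x))  # linear index of the zero process
    z1 = sum(b * v for b, v in zip(beta1, x))  # linear index of the count process
    psi = math.exp(z0) / (1 + math.exp(z0))
    return (1 - psi) * math.exp(z1)


# Invented parameters: 90% structural zeros, count mean 2, overdispersion 0.5.
total_mass = sum(zinb_pmf(y, 2.0, 0.5, 0.9) for y in range(400))
mean_from_pmf = sum(y * zinb_pmf(y, 2.0, 0.5, 0.9) for y in range(400))
```

Summing the probabilities returns one, and the mean computed from the probabilities equals (1 - psi) times mu, which is the structure of the expected count in Eq. S4.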

The linear indices for the zero-generating process and for the Negative Binomial
process are modeled in parallel as

$$\begin{aligned}
X\beta^j = {} & \beta_0^j + \beta_1^j\, border + \beta_2^j\, eu + \beta_3^j\, Distance + \beta_4^j\, Size_k + \beta_5^j\, Size_l + \gamma^j\, border * eu \\
& + \sum_{t=2}^{T} \theta_t^j\, year_t + \sum_{t=2}^{T} \delta_t^j\, border * year_t + \sum_{t=2}^{T} \zeta_t^j\, eu * year_t + \sum_{t=2}^{T} \eta_t^j\, border * eu * year_t,
\end{aligned} \qquad \text{(S5)}$$

where j = 0, 1. $Size_k$ and $Size_l$ denote the size of each of the two regions. We proxy
the size of a region by the total number of links attached to the region. Distance is the
geographical distance between the centroids of the regions and $year_t$ is the year dummy
variable. The time interval for estimation is generally 1986-2010 for patents and 1991-2009
for publications. The dummy border flags pairs of NUTS3 regions within the same country; the
dummy eu distinguishes pairs of NUTS3 regions that are both within the EU (eu = 1) from pairs
of NUTS3 regions for which neither is in the EU (eu = 0).^7 Cross-sections are pooled over
years and estimation is carried out on the whole sample, clustering standard errors at the
level of pairs of NUTS3 regions. Following the Difference-in-Differences (DiD) econometric
strategy, the full set of double/triple interaction dummy variables among the three
dimensions (eu = {0, 1}, border = {0, 1}, $year_t$ = {0, 1} for t = 2, ..., T) is relevant
to the identification of the treatment effect.

^7 Note that this dummy variable (eu) does not account for pairs of regions for which one is
in the EU and one is not. Such links are not included as they are simply not relevant to the
comparison we are focusing on.

In the literature on program evaluation, DiD estimation is one of the most popular strategies
for identifying the impact of a policy or treatment (29, 30, 31, 32). The treatment effect on
an outcome variable is, in general, defined as the difference between the outcome actually
observed under the treatment and the counterfactual, that is, the outcome that would have
been observed without treatment (31). Under this treatment-effect framework, our analysis
seeks to quantify the effect of EU institutional changes upon integration within the EU, by
measuring the relative rate of cross-border links within a given network. Moreover, to
isolate the signal arising only from EU factors we must control for the global rate of
cross-border integration. Specifically, we extend the standard DiD strategy of one state
indicator (treatment vs control group) to the case of two state indicators, providing a
control group of links between non-European countries. For the purpose of embedding the
institutional comparison in a temporal perspective, our analysis also includes a time
variable, in line with the standard treatment-effect formalism. Due to the addition of a
second state indicator our approach is a Difference-in-Differences-in-Differences (DiDiD)
estimator (26).



While the standard version of DiD estimation is designed for the linear case, it can be
extended to cases in which the outcome variable is non-continuous and nonlinear estimation is
preferred, as in our case (33). When employing the ZINB model it is not possible to make a
general statement regarding the sign of the treatment effect merely by checking the sign of
the coefficients of the interaction term(s). However, we can identify the treatment effect by
calculating the incremental effect of the interaction term through comparison of a given year
to a baseline year. In our framework, treatment effects are incremental effects of the triple
interaction terms border * eu * year_t, evaluated at the means of the regressors.^8

^8 See (25) for the computation of marginal effects for the ZINB model.

Denoting the actual and counterfactual outcomes of our count dependent variable as $Y^1$ and
$Y^0$ respectively and taking into account our DiDiD extension, the yearly treatment effect
($\tau_t$) can be defined as

$$\tau_t(year_t = 1, eu = 1, border = 1, M) = E[Y^1 \mid year_t = 1, eu = 1, border = 1, M] - E[Y^0 \mid year_t = 1, eu = 1, border = 1, M], \qquad \text{(S6)}$$

where M is the matrix of controls ($Size_k$, $Size_l$, Distance). Given the linear indices
modeled in Eq. S5, the expectation values of $Y^1$ and $Y^0$ for the group under treatment are

$$E[Y^1 \mid year_t = 1, eu = 1, border = 1, M] = \frac{\exp(\Theta^1 + \omega^1 M + \eta_t^1)}{1 + \exp(\Theta^0 + \omega^0 M + \eta_t^0)}, \qquad \text{(S7)}$$

$$E[Y^0 \mid year_t = 1, eu = 1, border = 1, M] = \frac{\exp(\Theta^1 + \omega^1 M)}{1 + \exp(\Theta^0 + \omega^0 M)}, \qquad \text{(S8)}$$

where $\Theta^j = \beta_0^j + \beta_1^j + \beta_2^j + \gamma^j + \theta_t^j + \delta_t^j + \zeta_t^j$,
and where $\omega^1$ and $\omega^0$ are the coefficient vectors for M (the controls). We can
then write Eq. S6 as

$$\tau_t = \frac{\exp(\Theta^1 + \omega^1 M + \eta_t^1)}{1 + \exp(\Theta^0 + \omega^0 M + \eta_t^0)} - \frac{\exp(\Theta^1 + \omega^1 M)}{1 + \exp(\Theta^0 + \omega^0 M)}. \qquad \text{(S9)}$$

We calculate the yearly treatment effect in Eq. S9 using parameter estimates of $\beta^0$ and
$\beta^1$ and the sample mean of M. Estimates for the parameters in vectors $\beta^0$ and
$\beta^1$ are obtained through maximum likelihood. Although differences in the linear indices
($\eta_t^1$ and $\eta_t^0$) are constant across individuals for a given t, it is clear from
Eq. S9 that differences in the dependent variable depend on the values chosen for M.
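
The treatment effect of Eq. S9 can be evaluated with a few lines of code once coefficient estimates are available. The sketch below is illustrative only: the function names are hypothetical, the argument names `theta1` and `theta0` stand for the summed coefficients of the count and zero processes, and all numerical values are invented rather than estimated.

```python
import math


def expected_links(theta1, theta0, omega1, omega0, m_bar, eta1=0.0, eta0=0.0):
    """exp(Theta^1 + omega^1 M + eta^1) / (1 + exp(Theta^0 + omega^0 M + eta^0))."""
    count_index = theta1 + sum(w * m for w, m in zip(omega1, m_bar)) + eta1
    zero_index = theta0 + sum(w * m for w, m in zip(omega0, m_bar)) + eta0
    return math.exp(count_index) / (1 + math.exp(zero_index))


def treatment_effect(theta1, theta0, omega1, omega0, m_bar, eta1, eta0):
    """Yearly treatment effect: actual minus counterfactual expected links."""
    actual = expected_links(theta1, theta0, omega1, omega0, m_bar, eta1, eta0)
    counterfactual = expected_links(theta1, theta0, omega1, omega0, m_bar)
    return actual - counterfactual


# Invented values: controls evaluated at their (hypothetical) sample means.
m_bar = [1.0]
tau_pos = treatment_effect(0.0, 0.0, [0.0], [0.0], m_bar, eta1=math.log(2), eta0=0.0)
tau_neg = treatment_effect(0.0, 0.0, [0.0], [0.0], m_bar, eta1=0.1, eta0=2.0)
```

The second call returns a negative effect even though the count-process coefficient is positive, because the zero-process coefficient raises the probability of a structural zero. This is why the sign of the interaction coefficients alone does not determine the sign of the treatment effect.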

Relative to the baseline year t* (we use the year 2003 in our analysis, as indicated by red
dots in Fig. 2 in the manuscript), the yearly treatment effect reflects the impact of changes
in institutional factors specific to the EU which have taken place in a given year t.
Estimates of $\tau_t$ are just marginal effects of the triple interaction term
border * eu * year_t. They are obtained by averaging over all the variables in the model and
thus refer to an "average" pair of regions.

Due to the large number of zero entries, in the regression analysis we ignore regions with
fewer than 50 total patents. For inventor mobility, the analysis focuses only on NUTS3 region
pairs with at least one link (nonzero counts). Since in the co-authorship network the
fraction of zeros is lower, we do not use any cutoff.^9

^9 Additional results at the more aggregate level of NUTS2 regions and with different cutoffs
have been produced and confirm our main findings. In other words, our results do not
critically depend on the definition and level of aggregation of the administrative regions we
consider. They are available upon request.

Table S2: Measurement of cross-border share of communities found in the co-inventor
networks for Europe and the USA in 2009. Data are the same as in Figs. 1 and S2. The
t-statistic is computed on the difference between averages. For the average percent
multi-state value 0.706, t = 7.16 (p < 0.01).

Columns for Europe: intra-country links, multi-country links, percent multi-country.
Columns for the USA: intra-state links, multi-state links, percent multi-state.

Community    Intra-country links    Multi-country links    Percent multi-country
Mannheim     1,612                  12

Figure S1: Network analysis of co-patent activity. (A) Schematic illustration of the network
methodology. For each year t we calculate weighted links from $N_n(t)$, the number of patents
between NUTS3 regions within a country n, and $N_{m,n}(t)$, the number of patents between
NUTS3 regions in different countries, as indicated by the green links. (B) The evolution of
the collaboration networks over time serves as the basis for analyzing the integration rate
of the EU innovation system, and these within-EU changes over time are compared to non-EU
changes over time. (C) For the set of EU countries, we show the annual cross-border share
$S(t) = N_X(t)/N_T(t)$, calculated as the ratio of the number
$N_X(t) = \sum_{m \neq n} N_{m,n}(t)$ of cross-border collaboration links to the total number
$N_T(t) = N_X(t) + \sum_n N_n(t)$ of both intra- and cross-border collaboration links. We
calculate the same quantity for the set of non-EU countries. The increase of S over time in
the co-inventor and mobility networks reflects a well-documented increasing trend in global
patent activity. However, the share difference $\Delta = S(EU) - S(nonEU)$, a coarse
indicator of relative integration that does not control for EU-specific factors, is
relatively flat for both measures, except for a small "jump" around 1998-2000 in the
co-inventor network. The relatively constant trend in $\Delta(t)$ is preliminary empirical
evidence that brings into question the effectiveness of EU policies aimed at accelerating
integration. Our econometric "treatment effect" approach further investigates the
effectiveness of EU integration policies by controlling for multiple underlying variables,
see Eq. S5. Link count values are listed in Table S1.

Figure S2: Community structure of the 2009 USA co-inventor network. We show only the top
13 communities and leave regions belonging to all other communities white. The most central
region of each community is listed in the legend and is determined by the procedure described
in Section 2. Communities were determined using the Newman Girvan algorithm (21) and the
Louvain algorithm (23). The green arcs are used to highlight some of the long-distance
members of the community for which San Francisco is the core region. Source: our computations
based on data and code available at http://cse.lab.imtlucca.it/SOM/SOM.zip.